Passage Retrieval API

The API for enterprises that want GenAI grounded in trusted content

What it does: Lets GenAI apps retrieve the right passages from your enterprise content to feed your LLMs and custom apps
Welcome to the Coveo Passage Retrieval API demo. I'm going to show you how a user query triggers the Passage Retrieval API and flows through to enhance the responses of different large language models. The Passage Retrieval API is the perfect solution for organizations considering building their own AI applications or leveraging AI agents, because it takes care of the most complex, resource-intensive, and risky part: retrieving accurate information to feed the LLM or AI agent. I'll show you how easy it is to use this API to enhance the accuracy, relevance, and security of any generative AI experience, reducing hallucinations, protecting sensitive data, and significantly accelerating time to value.

Right now you can see in the configuration window that we're using an existing query pipeline for our demo company, Barca Group, a boating equipment manufacturer selling direct to consumers via retail locations and online. Let's start with a common scenario: entering a query like "boat battery problem." The API instantly retrieves relevant text chunks, not full documents, along with source links and relevance scores, so your AI applications have reliable, current, and citable information for generated responses.

Next, we see that these chunks are fed into a large language model such as ChatGPT or Gemini using a custom prompt, which for this use case would be managed by you, the customer. For instance, you might want a standard customer-service tone in one application and a playful, sassy pirate tone in another. Here in our testing tool, we are comparing multiple LLMs side by side to see how each one interprets the same source information. With the Passage Retrieval API, Coveo takes care of the complex retrieval of knowledge, and you, the customer, can choose the LLM and its settings as well as tune the prompts to deliver exactly the kind of answers you want for your specific use case and audience. This is what makes passage retrieval so powerful.
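The query-to-passages flow described above can be sketched in a few lines. This is a minimal illustration, not the documented Coveo API: the payload field names (`query`, `maxPassages`) and response shape (`items`, `document.clickableUri`, `relevanceScore`) are assumptions, and the HTTP call itself is replaced by a mocked response so the sketch is self-contained.

```python
# Hypothetical request payload for a passage retrieval call.
# Field names are illustrative assumptions, not the documented Coveo API.
def build_retrieval_request(query, max_passages=5):
    return {
        "query": query,
        "maxPassages": max_passages,
    }

def parse_passages(response_json):
    """Extract (text, source URL, relevance score) triples from a
    response body shaped like the demo describes: text chunks, not
    full documents, each with a citable source link and a score."""
    return [
        (p["text"], p["document"]["clickableUri"], p["relevanceScore"])
        for p in response_json["items"]
    ]

# Mocked response standing in for the actual HTTP call.
mock_response = {
    "items": [
        {
            "text": "If the boat battery drains overnight, check the bilge pump circuit first.",
            "document": {"clickableUri": "https://support.example.com/kb/battery-drain"},
            "relevanceScore": 0.92,
        }
    ]
}

request = build_retrieval_request("boat battery problem")
passages = parse_passages(mock_response)
```

The key point of the demo survives even in this toy form: the retrieval layer hands your application small, scored, citable chunks rather than whole documents.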
You have full control over the UI, prompt engineering, LLM choice, and its parameters, while Coveo makes the indexing and retrieval easy and precise, especially for those who already have a Coveo index implemented. You can create new query pipelines or use existing ones, and still fine-tune them by applying weightings and business rules or leveraging machine learning like automatic relevance tuning. If your organization has unique compliance or security rules, those constraints are already handled by the Coveo search pipeline before anything is sent to the LLM. You can add any sort of metadata into the context of the query, from user access permissions to their role, country, and language, or even interaction data to signal intent.

Now, the studio environment I showed you is just a testing tool we've developed internally as a way to show how the API works. In a real-world scenario, you'd of course leverage the API in your own applications and testing tools. By building agentic apps using Coveo, you tap into the power of the Coveo AI Relevance platform, leveraging our connectors to third-party data sources, hybrid ranking, security features, and decades of experience building enterprise-grade search, retrieval, and machine learning models. By delivering precise, trustworthy text to LLMs or AI agents like Salesforce Agentforce, Microsoft Copilot, Amazon Q, and SAP Joule, you will boost accuracy, minimize hallucinations, and meet enterprise security requirements, all while improving deployment speed and reducing costs. Plus, with a unified index and multiple custom and managed solutions available to leverage it, you can scale multiple generative AI applications consistently using a single, reliable retrieval method.

Thanks for watching. I hope this gave you a clear idea of how the Passage Retrieval API can power fast, accurate, and highly customizable AI-driven experiences tailored to your brand, your content, and your users.
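The idea of enriching a query with user context (role, country, language) so that security trimming and relevance can take it into account might look like the sketch below. The `context` field name and its keys are illustrative assumptions for this example, not a documented request schema.

```python
# Sketch: attach user context to a retrieval request so the pipeline
# can apply permissions, business rules, and relevance tuning.
# The "context" key and its fields are assumptions for illustration.
def with_user_context(request, user):
    enriched = dict(request)  # copy; leave the original request untouched
    enriched["context"] = {
        "role": user["role"],
        "country": user["country"],
        "language": user["language"],
    }
    return enriched

base = {"query": "boat battery problem"}
enriched = with_user_context(
    base, {"role": "dealer", "country": "CA", "language": "en"}
)
```

Because the context rides along with the query, any compliance or security constraints are resolved on the retrieval side, before a single token reaches the LLM.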
Coveo Passage Retrieval API
Ground custom generative AI applications in relevant enterprise content without building your own search and retrieval system. Improve accuracy, reduce hallucinations, and meet enterprise security standards with a single retrieval method for your retrieval-augmented generation (RAG) framework that can scale across your enterprise. This headless API solution delivers precise text passages to your large language model (LLM) from a single unified index of all your enterprise knowledge — ensuring trust and relevance in every answer.
PRODUCT SHEET

Information Retrieval for Generative AI at Scale

Build trusted generative AI with enterprise-grade information retrieval using Coveo’s Passage Retrieval API.
Billion-dollar enterprises rely on Coveo AI‑Search

Generative Answering FAQ

Scaling Generative AI projects across an enterprise comes with several challenges. A common issue is the fragmentation caused by multiple teams running siloed proof-of-concept (POC) projects. These disconnected efforts often lead to duplicated work, inconsistent data pipelines, and fractured knowledge bases.

Security and compliance are also significant hurdles, as enterprise data requires strict access controls and must adhere to regulatory standards—making it cumbersome to create and maintain secure pipelines.

Additionally, integrating diverse data sources with varying formats and APIs is complex, requiring ongoing maintenance to prevent stale data or connection issues.

Finally, achieving scalability is difficult, as production systems must handle millions of users and documents while maintaining high performance and reliability.

Enterprise Reality: Deploying GenAI for real business use cases is far more complex than a polished proof-of-concept or quick demonstration. While many have invested in training their own LLMs, security, scalability, and data quality are ongoing concerns.

The Retrieval Gap: Retrieval Augmented Generation (RAG) is widely touted as the way to ground LLMs in an organization’s data, but information retrieval is harder than organizations expect. Many teams underestimate the operational overhead of connecting to, securing, unifying, and maintaining data sources at scale.

Retrieval is essential because it ensures Generative AI applications are grounded in accurate, timely, and relevant data. While creating prompts or selecting a large language model (LLM) can be relatively straightforward, feeding the LLM with high-quality data is far more challenging. If the retrieval system is flawed—due to stale data, poor search relevance, or fragmented sources—the AI outputs will be unreliable, leading to “garbage in, garbage out” scenarios. Coveo addresses this by indexing data across silos, ranking results with AI, and enforcing strict security measures, ensuring the LLM receives the best possible information. This robust retrieval foundation is what enables scalable and reliable generative AI applications.

Deciding how much of your generative AI experiences to build varies by customer and use case. It’s common for organizations to deploy a combination of custom and out-of-the-box generative answering solutions.

The reasons for building custom generative AI applications are often to control the front-end user interface, to use a built-for-purpose or enterprise-trained LLM, to tie into a chatbot or agentic workflow, or for more granular control over the prompt engineering. In all of these cases, the information retrieval portion of RAG is the most difficult to create and maintain; this is where the Coveo Passage Retrieval API comes in.
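For teams building the custom path described above, the generation half of RAG — assembling retrieved passages into a prompt with your own tone and rules — is the part you keep control of. A minimal sketch, assuming an OpenAI-style message list (the tone templates, message format, and passage layout are illustrative choices, not a prescribed Coveo pattern):

```python
# Sketch: turn retrieved passages into an LLM prompt with a custom tone.
# You own this layer; swap `messages` into whichever LLM SDK you use.
TONES = {
    "service": "You are a courteous customer-service assistant.",
    "pirate": "You are a playful, sassy pirate. Answer in character.",
}

def build_prompt(question, passages, tone="service"):
    """passages: list of (text, source_url) pairs from the retrieval step."""
    context = "\n\n".join(
        f"[{i + 1}] {text} (source: {url})"
        for i, (text, url) in enumerate(passages)
    )
    return [
        {
            "role": "system",
            "content": TONES[tone]
            + " Answer only from the passages below and cite sources.",
        },
        {
            "role": "user",
            "content": f"Passages:\n{context}\n\nQuestion: {question}",
        },
    ]

passages = [
    ("Check the bilge pump circuit for a parasitic drain.",
     "https://support.example.com/kb/battery-drain"),
]
messages = build_prompt(
    "Why does my boat battery die overnight?", passages, tone="pirate"
)
```

Instructing the model to answer only from the supplied passages, with numbered citations, is what turns the retrieval layer's output into grounded, attributable answers — the same prompt skeleton works whether the tone is customer service or sassy pirate.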

If you’re looking for an out-of-the-box managed solution for service, website, workplace and commerce use cases, check out Coveo Relevance Generative Answering. This product leverages the same retrieval API but offers a plug-and-play generative answering experience with exceptional time-to-value.

To unify and scale GenAI projects, consolidate efforts around a single, secure information retrieval system. Fragmented POCs often lead to duplicated work, fractured data, and inconsistent security. A robust platform like Coveo centralizes retrieval, indexing data across silos while ensuring secure, relevant, and scalable access. By grounding your LLMs with reliable retrieval, you streamline operations and enable consistent, enterprise-wide AI deployment.

Scalability comes from proven, enterprise-grade technology built for high volumes and complexity. Coveo’s AI-driven retrieval engine ensures secure, compliant, and reliable performance for millions of documents and users. With plug-and-play generative answers and custom retrieval APIs, partnering with Coveo reduces development time, cuts costs, and accelerates ROI of retrieval augmented generation projects, delivering impactful GenAI solutions in weeks.
